Building an R Package

From Scripts to Standardized Open-Source Tools

David Munoz Tord - Senior R Developer

Cytel FSP - Janssen

August 19, 2025

Life in Pharma: Tables, Tables Everywhere

junco as a case study

Challenges in Pharmaceutical Programming

The pharmaceutical industry faces unique challenges in clinical and statistical programming:

  • Implementing patterns used across many table shells (developed by dedicated teams)
  • Company-specific statistical methods that need to be standardized
  • Complex table structures that must be consistent across studies
  • Need for a core framework that ensures all company shells can be created consistently

Shells Are Table Specific, Table Creation Is Not

The Junco Package Approach

Our solution was to develop our own business-logic framework for J&J table creation.

Getting to Production

The Junco package provides key features needed for production-ready clinical tables:

  • True-type font support: Word wrapping, pagination, and RTF export
  • Higher order column counts: Utilities for spanning column headers
  • Guaranteed pathability in row space
  • Nearest-value (SAS-like) rounding support: Maintaining consistency with existing processes
  • Statistical calculations: In accord with business logic
  • Robust approach for risk difference columns
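
To illustrate the rounding point: base R's round() sends exact ties to the even digit, while SAS's ROUND function sends them away from zero, so the same summary can differ in the last displayed digit. The sketch below shows one common way to reproduce nearest-value rounding in R. The name round_half_away and the fuzz tolerance are assumptions made for this example; this is not junco's actual implementation.

```r
# Hypothetical helper: round half away from zero, as SAS's ROUND() does.
# Not taken from junco; the name and the fuzz tolerance are assumptions.
round_half_away <- function(x, digits = 0) {
  scale <- 10^digits
  # A small fuzz absorbs floating-point representation error (e.g. 1.005).
  fuzz <- sqrt(.Machine$double.eps)
  sign(x) * floor(abs(x) * scale + 0.5 + fuzz) / scale
}

round(2.5)             # base R: 2 (tie goes to the even digit)
round_half_away(2.5)   # 3 (tie goes away from zero)
round_half_away(-2.5)  # -3
```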

Creating the package

From Scripts to Standardized Tools

Industry Benefits

Creating internal R packages helps pharmaceutical companies:

  • Maintain consistency across studies and therapeutic areas
  • Reduce time spent on repetitive coding tasks
  • Improve compliance with regulatory standards
  • Facilitate knowledge transfer between team members

Code Design and API for Users

Once we had the core functionality working, we needed to focus on:

  • User-friendly API: Creating intuitive functions that match users’ mental models
  • Consistent interfaces: Ensuring functions work together seamlessly
  • Clear documentation: Making the package accessible to non-developers
  • Comprehensive testing: Validating functionality across different scenarios
  • Continuous integration: Automating quality checks and deployment
  • Version control: Managing changes and updates systematically

Documentation: Making Your Package Accessible

Using roxygen2 and pkgdown

Documentation is critical for package adoption and proper use:

  1. roxygen2 for function documentation:
#' Calculate risk difference with confidence intervals
#'
#' @param group1 Vector of outcomes for first group
#' @param group2 Vector of outcomes for second group
#' @param conf.level Confidence level (default: 0.95)
#' @return A list with risk difference and CIs
#' @export
risk_diff <- function(group1, group2, conf.level = 0.95) {
  # Function implementation
}
  2. pkgdown for website generation:
usethis::use_pkgdown()
pkgdown::build_site()
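
The risk_diff() body above is left as a placeholder. One possible way to fill it in is sketched here, assuming binary (logical or 0/1) outcomes and a simple Wald (normal-approximation) interval. The choice of interval is an assumption made for illustration; the method used in junco or required by a given analysis plan may differ.

```r
# Illustrative only: a Wald (normal-approximation) interval. The method
# actually used in junco or in a given analysis plan may differ.
risk_diff <- function(group1, group2, conf.level = 0.95) {
  stopifnot(conf.level > 0, conf.level < 1)
  # Outcomes are assumed to be logical or 0/1; missing values are dropped.
  g1 <- as.logical(group1[!is.na(group1)])
  g2 <- as.logical(group2[!is.na(group2)])
  p1 <- mean(g1)
  p2 <- mean(g2)
  diff <- p1 - p2
  se <- sqrt(p1 * (1 - p1) / length(g1) + p2 * (1 - p2) / length(g2))
  z <- stats::qnorm(1 - (1 - conf.level) / 2)
  list(
    estimate = diff,
    lower = diff - z * se,
    upper = diff + z * se,
    conf.level = conf.level
  )
}

res <- risk_diff(c(1, 1, 0, 0), c(1, 0, 0, 0))
res$estimate  # 0.25
```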

Ensuring Package Quality

Unit Testing and Validation

Quality assurance is essential for pharmaceutical applications:

  • Unit tests: Verify each function works as expected
  • Integration tests: Ensure components work together correctly
  • Regression tests: Prevent reintroduction of fixed bugs
  • Validation documentation: Meet regulatory requirements
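
As a minimal sketch of what a unit test looks like with testthat, the example below tests a small formatting helper. The helper format_count_pct() is hypothetical and exists only for this illustration; it is not part of junco.

```r
library(testthat)

# Hypothetical helper used only for this illustration (not part of junco).
format_count_pct <- function(n, N) {
  stopifnot(N > 0, n >= 0, n <= N)
  sprintf("%d/%d (%.1f%%)", as.integer(n), as.integer(N), 100 * n / N)
}

test_that("format_count_pct() renders n/N (%) cells", {
  expect_equal(format_count_pct(72, 72), "72/72 (100.0%)")
  expect_equal(format_count_pct(1, 3), "1/3 (33.3%)")
})

test_that("format_count_pct() rejects impossible counts", {
  expect_error(format_count_pct(5, 3))
})
```

In a package, tests like these usually live under tests/testthat/, which usethis::use_testthat() sets up.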

Tip for Pharmaceutical Validation

Include detailed information about validation status and intended use cases to help users understand if the package meets their regulatory requirements.

Continuous Integration and Deployment

Automating Quality Checks

CI/CD reduces manual work while improving quality:

  • Automated testing: Run tests on every code change
  • Code coverage: Ensure comprehensive test coverage
  • Style checking: Maintain consistent coding standards
  • Documentation building: Keep documentation in sync with code
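
One way to set up checks like these, assuming the package is hosted on GitHub and uses GitHub Actions (the slides do not say which CI service is used), is through the workflow templates that usethis can install. The wrapper name setup_ci is hypothetical.

```r
# Sketch: wrapped in a function so that nothing is written until it is
# called from inside a package project that uses Git and GitHub.
setup_ci <- function() {
  usethis::use_github_action("check-standard")  # R CMD check on push and PR
  usethis::use_github_action("test-coverage")   # test coverage via covr
  usethis::use_github_action("lint")            # style checks via lintr
  usethis::use_github_action("pkgdown")         # rebuild the documentation site
  invisible(TRUE)
}
```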

Dependency Management

Balancing Functionality and Stability

Managing dependencies is critical for long-term maintenance:

  • Minimize external dependencies when possible
  • Specify version requirements for critical dependencies
  • Consider internal functions instead of importing from less stable packages
  • Document dependency rationale for future maintainers
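
For optional dependencies, a small guard can fail early with a clear message instead of erroring deep inside a table function. The helper below is a hypothetical sketch, not code from junco.

```r
# Hypothetical helper: fail early, with a clear message, when an optional
# (Suggests) dependency is missing or older than the version we rely on.
require_optional <- function(pkg, min_version = NULL) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is required for this feature.", call. = FALSE)
  }
  if (!is.null(min_version) &&
      utils::packageVersion(pkg) < package_version(min_version)) {
    stop("Package '", pkg, "' >= ", min_version, " is required.", call. = FALSE)
  }
  invisible(TRUE)
}

require_optional("stats", min_version = "3.0.0")
```

Hard dependencies are declared in the DESCRIPTION file instead; usethis::use_package() can add them, including a minimum version through its min_version argument.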

Advanced Topics

Version Control and Collaboration

Best Practices for Team Development

Effective collaboration requires good processes:

  • Semantic versioning: Major.Minor.Patch format (e.g., junco v0.1.1)
  • Git branching strategy: Feature branches and pull requests
  • Code review: Ensure quality and knowledge sharing
  • Issue tracking: Document bugs and feature requests
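
To make the Major.Minor.Patch rule concrete, here is a small sketch of how each kind of bump changes a version string. The function bump_version() is hypothetical and for illustration only; in a real package, usethis::use_version() updates the DESCRIPTION file for you.

```r
# Hypothetical helper illustrating semantic version bumps.
bump_version <- function(version, part = c("patch", "minor", "major")) {
  part <- match.arg(part)
  v <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(v) == 3, !anyNA(v))
  v <- switch(part,
    major = c(v[1] + 1L, 0L, 0L),     # breaking change: reset minor and patch
    minor = c(v[1], v[2] + 1L, 0L),   # new feature: reset patch
    patch = c(v[1], v[2], v[3] + 1L)  # bug fix only
  )
  paste(v, collapse = ".")
}

bump_version("0.1.1", "patch")  # "0.1.2"
bump_version("0.1.1", "minor")  # "0.2.0"

# Base R compares versions component by component, not as strings.
package_version("0.1.1") < package_version("0.2.0")  # TRUE
```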

Internal vs. CRAN Packages

Choosing the Right Distribution Method

Different distribution methods serve different needs:

  • Internal packages: Company-specific methods, proprietary algorithms
  • CRAN packages: General-purpose tools, widely applicable methods
  • GitHub packages: Community collaboration, rapid development

Pharmaceutical Considerations

Internal packages often contain proprietary methods or company-specific workflows that shouldn’t be publicly shared, while more general statistical methods may benefit from community review through CRAN or GitHub distribution.

Case Study: Clinical Trial Analysis Package

From Scripts to Package

Example transformation of repetitive clinical trial analysis scripts:

  1. Identify common functions across multiple studies
  2. Standardize input/output formats
  3. Create consistent documentation
  4. Implement validation tests
  5. Deploy to internal repository
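
Step 2 (standardizing input/output formats) can be sketched as a small validation function that every table function calls first. The helper standardize_input() and its default column list are assumptions made for this example, not part of junco.

```r
# Hypothetical helper: check that an analysis dataset has the columns a
# table function expects, and return only those columns in a fixed order.
standardize_input <- function(df, required = c("USUBJID", "TRT01A", "AVAL")) {
  stopifnot(is.data.frame(df))
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  df[, required, drop = FALSE]
}

# Toy data for illustration only.
adeg_like <- data.frame(
  USUBJID = c("01", "02"),
  TRT01A = c("Placebo", "Apalutamide"),
  AVAL = c(72, 80),
  EXTRA = c("a", "b")
)
names(standardize_input(adeg_like))  # "USUBJID" "TRT01A" "AVAL"
```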

Summary

Key Takeaways

  • R packages provide a structured framework for organizing code
  • They improve reproducibility and reduce errors
  • Documentation and testing are essential components
  • Packages can streamline workflows in pharmaceutical research
  • The investment in package development pays off through reuse and reliability

Demo: Creating Complex Tables with Junco

Table Creation Script

library(junco)
library(rtables) # basic_table(), split_cols_by(), analyze(), build_table()
library(dplyr)
library(pharmaverseadamjnj)

ADEG <- pharmaverseadamjnj::adeg |>
  select(STUDYID, USUBJID, TRT01A, PARAM, AVISIT, AVAL, CHG) |>
  filter(PARAM == "ECG Mean Heart Rate (beats/min)") |>

  mutate(colspan_trt = factor(
    if_else(TRT01A == "Placebo", " ", "Active Study Agent"),
    levels = c("Active Study Agent", " ")
  )) |>

  mutate(rrisk_header = "Risk Difference (%) (95% CI)") |>
  mutate(rrisk_label = paste(TRT01A, paste("vs", "Placebo")))

colspan_trt_map <- create_colspan_map(ADEG,
  non_active_grp = "Placebo",
  non_active_grp_span_lbl = " ",
  active_grp_span_lbl = "Active Study Agent",
  colspan_var = "colspan_trt",
  trt_var = "TRT01A"
)
ref_path <- c("colspan_trt", " ", "TRT01A", "Placebo")

lyt <- basic_table() |>
  split_cols_by(
    "colspan_trt",
    split_fun = trim_levels_to_map(map = colspan_trt_map)
  ) |>
  split_cols_by("TRT01A") |>
  split_rows_by(
    "PARAM",
    label_pos = "topleft",
    split_label = "Blood Pressure",
    section_div = " ",
    split_fun = drop_split_levels
  ) |>
  split_rows_by(
    "AVISIT",
    label_pos = "topleft",
    split_label = "Study Visit",
    split_fun = drop_split_levels,
    child_labels = "hidden"
  ) |>
  split_cols_by_multivar(
    c("AVAL", "AVAL", "CHG"),
    varlabels = c("n/N (%)", "Mean (CI)", "CFB (CI)")
  ) |>
  split_cols_by("rrisk_header", nested = FALSE) |>
  split_cols_by(
    "TRT01A",
    split_fun = remove_split_levels("Placebo"),
    labels_var = "rrisk_label"
  ) |>
  split_cols_by_multivar(c("CHG"), varlabels = c(" ")) |>
  analyze("STUDYID",
    afun = a_summarize_aval_chg_diff_j,
    extra_args = list(
      format_na_str = "-", d = 0,
      ref_path = ref_path, variables = list(arm = "TRT01A", covariates = NULL)
    )
  )

result <- build_table(lyt, ADEG)

Rendered Table Output

rtables.officer::tt_to_flextable(result)

Blood Pressure: ECG Mean Heart Rate (beats/min)

Active Study Agent: Apalutamide

Study Visit   n/N (%)          Mean (CI)              CFB (CI)
Baseline      72/72 (100.0%)   319.5 (281.0, 358.0)
Month 1       72/72 (100.0%)   252.3 (208.0, 296.6)   -67.2 (-119.9, -14.4)
Month 3       72/72 (100.0%)   306.7 (262.3, 351.1)   -12.8 (-75.5, 50.0)
Month 6       68/68 (100.0%)   273.5 (234.4, 312.5)   -42.0 (-103.1, 19.1)
Month 9       56/56 (100.0%)   277.6 (233.0, 322.1)   -33.1 (-106.9, 40.7)
Month 12      50/50 (100.0%)   313.8 (265.4, 362.2)   -4.9 (-62.8, 52.9)
Month 15      37/37 (100.0%)   300.9 (245.4, 356.4)   -19.9 (-96.0, 56.1)
Month 18      32/32 (100.0%)   313.3 (241.1, 385.5)   5.4 (-90.7, 101.5)
Month 24      30/30 (100.0%)   296.0 (229.3, 362.7)   -15.3 (-118.6, 88.0)

Active Study Agent: Apalutamide Subgroup

Study Visit   n/N (%)          Mean (CI)              CFB (CI)
Baseline      96/96 (100.0%)   313.6 (276.9, 350.3)
Month 1       94/94 (100.0%)   286.0 (247.1, 324.9)   -30.5 (-82.8, 21.7)
Month 3       73/73 (100.0%)   311.3 (268.5, 354.2)   2.4 (-58.8, 63.5)
Month 6       65/65 (100.0%)   281.5 (239.0, 324.0)   -34.9 (-95.9, 26.2)
Month 9       60/60 (100.0%)   312.4 (263.6, 361.2)   2.9 (-61.9, 67.6)
Month 12      52/52 (100.0%)   324.7 (273.6, 375.8)   21.9 (-46.0, 89.8)
Month 15      42/42 (100.0%)   279.9 (228.2, 331.5)   -13.2 (-92.8, 66.5)
Month 18      31/31 (100.0%)   272.7 (202.1, 343.3)   -43.9 (-146.8, 59.0)
Month 24      27/27 (100.0%)   373.7 (304.7, 442.6)   61.4 (-35.7, 158.6)

Placebo

Study Visit   n/N (%)          Mean (CI)              CFB (CI)
Baseline      86/86 (100.0%)   258.2 (223.5, 292.9)
Month 1       84/84 (100.0%)   291.9 (253.4, 330.3)   34.0 (-16.0, 84.0)
Month 3       82/82 (100.0%)   283.8 (245.9, 321.6)   24.6 (-25.1, 74.4)
Month 6       76/76 (100.0%)   303.8 (261.4, 346.2)   40.7 (-19.8, 101.2)
Month 9       73/73 (100.0%)   310.9 (269.1, 352.6)   50.1 (-12.6, 112.8)
Month 12      69/69 (100.0%)   319.8 (277.6, 362.0)   59.5 (7.3, 111.7)
Month 15      68/68 (100.0%)   291.1 (251.0, 331.2)   29.4 (-32.9, 91.7)
Month 18      66/66 (100.0%)   288.2 (250.1, 326.3)   24.1 (-28.6, 76.9)
Month 24      59/59 (100.0%)   298.7 (254.3, 343.1)   33.6 (-25.8, 93.0)

Risk Difference (%) (95% CI)

Study Visit   Apalutamide vs Placebo    Apalutamide Subgroup vs Placebo
Baseline
Month 1       -101.2 (-173.3, -29.1)    -64.5 (-136.3, 7.3)
Month 3       -37.4 (-116.9, 42.1)      -22.2 (-100.4, 56.0)
Month 6       -82.7 (-167.9, 2.6)       -75.6 (-160.7, 9.6)
Month 9       -83.2 (-179.2, 12.7)      -47.3 (-136.5, 42.0)
Month 12      -64.5 (-141.6, 12.6)      -37.6 (-122.4, 47.2)
Month 15      -49.3 (-146.4, 47.7)      -42.6 (-142.5, 57.4)
Month 18      -18.7 (-127.2, 89.7)      -68.0 (-182.4, 46.4)
Month 24      -48.9 (-166.6, 68.8)      27.8 (-84.4, 140.1)

Resources

Q&A

Thank You for Your Attention!

Have questions about R package development or the Junco package?

Feel free to reach out using the contact information provided.

Contact Information